Breakdown of the Internet under intentional attack 






Reuven Cohen¹*, Keren Erez¹, Daniel ben-Avraham², and Shlomo Havlin¹
¹ Minerva Center and Department of Physics, Bar-Ilan University, Ramat-Gan, Israel
² Department of Physics, Clarkson University, Potsdam, NY 13699-5820, USA

We study the tolerance of random networks to intentional attack, whereby a fraction $p$ of the most connected sites is removed. We focus on scale-free networks, having connectivity distribution $P(k) \sim k^{-\alpha}$ (where $k$ is the site connectivity), and use percolation theory to study analytically and numerically the critical fraction $p_c$ needed for the disintegration of the network, as well as the size of the largest connected cluster. We find that even networks with $\alpha < 3$, known to be resilient to random removal of sites, are sensitive to intentional attack. We also argue that, near criticality, the average distance between sites in the spanning (largest) cluster scales with its mass, $M$, as $\sqrt{M}$, rather than as $\log_k M$, as expected for random networks away from criticality. Thus, the disruptive effects of intentional attack become relevant even before the critical threshold is reached.

The stability of scale-free random networks under removal of a fraction of their sites, especially in the context of the Internet, has recently attracted interest [?]. The Internet can be viewed as a special case of a random, scale-free network, where the probability that a site is connected to $k$ other sites follows a power law: $P(k) \sim k^{-\alpha}$ ($\alpha \approx 2.5$ for the Internet). It is now well established that if a fraction $p$ of the sites is removed randomly, then for $\alpha > 3$ there exists a critical threshold $p_c$ such that for $p > p_c$ the network disintegrates; networks with $\alpha < 3$ are more resilient and do not undergo this transition, although finite networks (such as the Internet) may eventually be disrupted when nearly all of their sites are removed, as shown numerically in [1, ?] and analytically in [?].

Albert et al. [?] have introduced a model for intentional attack, or sabotage, of random networks: the removal of sites is not random, but rather sites with the highest connectivity are targeted first. Their numerical simulations suggest that scale-free networks are highly sensitive to this kind of attack. In this Letter we study the problem of intentional attack in scale-free networks. Our study focuses on the exact value of the critical fraction needed for disruption and the size of the remaining largest connected cluster. We also study the distance between sites on this cluster near the transition. We find, both analytically and numerically, that scale-free networks are highly sensitive to sabotage of a small fraction of the sites, for all values of $\alpha$, lending support to the view of Albert et al. [?].

In a recent paper [?] we have studied the properties of the percolation phase transition in scale-free random networks, and applied a general criterion for the existence of a spanning cluster (a cluster whose size is proportional to the size of the network):

$$\kappa \equiv \frac{\langle k^2 \rangle}{\langle k \rangle} = 2. \tag{1}$$

Here $k$ is the site connectivity, and averages, indicated by angular brackets, are taken over all sites of the network. When a fraction $p$ of the sites is randomly removed (or a fraction $p$ of the links is removed, or leads to deleted sites), the distribution of site connectivity changes from the original $P(k)$ to a new distribution $\tilde{P}(k)$:

$$\tilde{P}(k) = \sum_{k_0=k}^{\infty} P(k_0) \binom{k_0}{k} (1-p)^k p^{k_0-k}. \tag{2}$$

Using this criterion together with Eq. (2), the critical threshold $p = p_c$ is found to be:

$$1 - p_c = \frac{1}{\kappa_0 - 1}, \tag{3}$$

where $\kappa_0 = \langle k_0^2 \rangle / \langle k_0 \rangle$ is calculated from the original connectivity distribution, before the removal of any sites.

A wide range of networks, including the Internet, have site connectivities which follow a power-law distribution [?]:

$$P(k) = c k^{-\alpha}, \qquad k = m, m+1, \ldots, K, \tag{4}$$

where $k = m$ is the minimal connectivity and $k = K$ is an effective connectivity cutoff present in finite networks. For the distribution (4), $\kappa_0$ can be approximated by [?]:

$$\kappa_0 \approx \left(\frac{2-\alpha}{3-\alpha}\right) \frac{K^{3-\alpha} - m^{3-\alpha}}{K^{2-\alpha} - m^{2-\alpha}}. \tag{5}$$

This, together with Eq. (2), was used to show that networks with $\alpha < 3$, which have a divergent second moment, are resilient to random deletion of sites [?]. Indeed, when the number of sites in such networks $N \to \infty$, the upper cutoff $K \to \infty$, and there exists a spanning cluster for all values of $p < 1$. Another approach, based on generating functions, was introduced in [8] and was used to study a similar problem in [?].
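The resilience of networks with $\alpha < 3$ to random removal can be checked numerically. The following is a minimal sketch, not the authors' code: it evaluates $\kappa_0 = \langle k_0^2 \rangle / \langle k_0 \rangle$ by direct summation over a discrete power law (rather than the integral approximation) and applies the threshold formula $1 - p_c = 1/(\kappa_0 - 1)$. The function names are hypothetical.

```python
def kappa0(alpha, m, K):
    """<k^2>/<k> for P(k) proportional to k**(-alpha), k = m, ..., K.

    The normalization constant cancels in the ratio, so it is omitted.
    """
    second = sum(k ** (2.0 - alpha) for k in range(m, K + 1))
    first = sum(k ** (1.0 - alpha) for k in range(m, K + 1))
    return second / first


def random_removal_threshold(alpha, m, K):
    """Critical fraction p_c for random site removal, 1 - p_c = 1/(kappa0 - 1).

    Returns None when kappa0 <= 2, i.e. when the criterion says there is
    no spanning cluster even before any site is removed.
    """
    kap = kappa0(alpha, m, K)
    if kap <= 2.0:
        return None
    return 1.0 - 1.0 / (kap - 1.0)


if __name__ == "__main__":
    for K in (10**3, 10**4, 10**5):
        print(K,
              random_removal_threshold(2.5, 1, K),   # creeps towards 1 as K grows
              random_removal_threshold(3.5, 2, K))   # settles at a value below 1
```

Under these assumptions the threshold for $\alpha = 2.5$ approaches 1 as the cutoff grows, while for $\alpha = 3.5$ it stays well below 1.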

Consider now intentional attack, or sabotage [?], whereby a fraction $p$ of the sites with the highest connectivity is removed. (The links emanating from the sites are removed as well.) This has the following effects: (a) the cutoff connectivity $K$ is reduced to some new value, $\tilde{K} < K$, and (b) the connectivity distribution of the remaining sites is no longer scale-free, but is changed, because of the removal of many of their links. The upper cutoff $K$ before the attack may be estimated from

$$\sum_{k=K}^{\infty} P(k) = \frac{1}{N}, \tag{6}$$

where $N$ is the total number of sites in the network. Similarly, the new cutoff $\tilde{K}$, after the attack, can be estimated from

$$\sum_{k=\tilde{K}}^{K} P(k) = \sum_{k=\tilde{K}}^{\infty} P(k) - \frac{1}{N} = p. \tag{7}$$

If the size of the system is large, $N \gg 1/p$, the original cutoff $K$ may be safely ignored. We can then obtain $\tilde{K}$ approximately by replacing the sum with an integral [?]:

$$\tilde{K} = m p^{1/(1-\alpha)}. \tag{8}$$
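As an illustration of the cutoff estimate, the sketch below compares the continuum formula $\tilde{K} = m p^{1/(1-\alpha)}$ with a direct evaluation of the discrete sum over $P(k) \propto k^{-\alpha}$. Both function names, and the convention of taking the smallest $\tilde{K}$ whose tail weight does not exceed $p$, are assumptions made here for illustration.

```python
def cutoff_continuous(p, alpha, m):
    """Continuum estimate of the post-attack cutoff: m * p**(1/(1 - alpha))."""
    return m * p ** (1.0 / (1.0 - alpha))


def cutoff_discrete(p, alpha, m, K):
    """Smallest k such that the fraction of sites with connectivity >= k
    is at most p, for P(k) proportional to k**(-alpha) on k = m, ..., K."""
    weights = {k: k ** (-alpha) for k in range(m, K + 1)}
    norm = sum(weights.values())
    tail = 0.0
    for k in range(K, m - 1, -1):      # accumulate the tail from the top down
        tail += weights[k] / norm
        if tail > p:                   # including k would remove more than p
            return k + 1
    return m


if __name__ == "__main__":
    p, alpha, m = 0.01, 2.5, 1
    print(cutoff_continuous(p, alpha, m))
    print(cutoff_discrete(p, alpha, m, 10**5))
```

For small $m$ the discrete value can differ noticeably from the continuum estimate, because the integral is a crude approximation to the normalization sum there.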



We estimate the impact of the attack on the distribution of the remaining sites as follows. The removal of a fraction $p$ of the sites with the highest connectivity results in a random removal of links from the remaining sites: links that had connected the removed sites with the remaining sites. The probability $\tilde{p}$ of a link leading to a deleted site equals the ratio of the number of links belonging to deleted sites to the total number of links:

$$\tilde{p} = \sum_{k=\tilde{K}}^{K} \frac{k P(k)}{\langle k_0 \rangle}, \tag{9}$$

where $\langle k_0 \rangle$ is the initial average connectivity. With the usual continuous approximation, and neglecting $K$, this yields

$$\tilde{p} = \left(\frac{\tilde{K}}{m}\right)^{2-\alpha} = p^{(2-\alpha)/(1-\alpha)}, \tag{10}$$

for $\alpha > 2$. For $\alpha = 2$, $\tilde{p} \to 1$, since just a few nodes of very high connectivity control the entire connectedness of the system. Indeed, consider a finite system of $N$ sites and $\alpha = 2$. The upper cutoff $K \approx N$ must then be taken into account, and approximating Eq. (9) by an integral yields $\tilde{p} = \ln(Np/m)$. That is, for $\alpha = 2$, very small values of $p$ are needed to destroy an arbitrarily large fraction of the links as $N \to \infty$.
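A short numerical illustration of the continuum result $\tilde{p} = p^{(2-\alpha)/(1-\alpha)}$ shows how strongly the removed sites are over-represented among link ends. This is a sketch only; the function name is hypothetical and the formula is applied as stated, for $\alpha > 2$.

```python
def link_removal_probability(p, alpha):
    """Fraction of link ends attached to the removed top-p sites,
    in the continuum approximation (valid only for alpha > 2)."""
    if alpha <= 2.0:
        raise ValueError("the continuum formula requires alpha > 2")
    return p ** ((2.0 - alpha) / (1.0 - alpha))


if __name__ == "__main__":
    for alpha in (2.1, 2.5, 3.0, 3.5):
        # Removing 1% of the sites removes a far larger share of link ends,
        # and the effect strengthens as alpha approaches 2.
        print(alpha, link_removal_probability(0.01, alpha))
```

For $\alpha = 2.5$ the exponent is $1/3$, so removing the top 1% of sites already affects more than a fifth of the link ends.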



With these results known we can compute the effect of intentional attack, using the theory previously developed for random removal of sites [?]. Essentially, the network after attack is equivalent to a scale-free network with cutoff $\tilde{K}$ that has undergone random removal of a fraction $\tilde{p}$ of its sites. This can be seen as the result of two processes: (a) Removal of the highest-connectivity sites reduces the upper cutoff. Since this effect changes the connectivity distribution, $\kappa_0$ needs to be recalculated accordingly. (b) Removal of the links leading to the removed sites. The probability of removing a link is $\tilde{p}$, the probability that a randomly chosen link leads to one of the removed sites, and all links have the same probability of being deleted. Since this effect has the influence on the probability distribution described in Eq. (2), the result in Eq. (3) can be used, with $\tilde{p}$ replacing $p$. (Notice that for random site deletion the probability of a link leading to a deleted site is identical to the fraction of deleted sites.)
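The contrast between attack and random removal can also be explored by direct simulation. The sketch below is an assumption-laden illustration, not the simulation procedure used for the figures: it builds a random network by pairing connection stubs at random (a configuration-model construction, which the text does not specify), removes either the most connected sites or a random set of the same size, and measures the largest cluster with a union-find structure. All names and parameter values are hypothetical.

```python
import random


def power_law_degrees(n, alpha, m, k_max, rng):
    """Draw n connectivities from P(k) proportional to k**(-alpha), k = m..k_max."""
    ks = list(range(m, k_max + 1))
    return rng.choices(ks, weights=[k ** (-alpha) for k in ks], k=n)


def random_pairing_edges(degrees, rng):
    """Pair stubs at random; self-loops and repeated links are kept as they
    do not affect connectivity, and an odd leftover stub is dropped."""
    stubs = [node for node, k in enumerate(degrees) for _ in range(k)]
    rng.shuffle(stubs)
    return [(stubs[i], stubs[i + 1]) for i in range(0, len(stubs) - 1, 2)]


def largest_cluster_fraction(n, edges, removed):
    """Largest cluster size divided by the number of undeleted sites."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for a, b in edges:
        if a in removed or b in removed:
            continue                        # links of removed sites vanish too
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
    sizes = {}
    for node in range(n):
        if node not in removed:
            root = find(node)
            sizes[root] = sizes.get(root, 0) + 1
    survivors = n - len(removed)
    return max(sizes.values()) / survivors if survivors else 0.0


def compare_attack_and_random(n=5000, alpha=2.5, m=1, p=0.1, seed=0):
    rng = random.Random(seed)
    degrees = power_law_degrees(n, alpha, m, n - 1, rng)
    edges = random_pairing_edges(degrees, rng)
    n_remove = int(p * n)
    by_degree = sorted(range(n), key=lambda v: degrees[v], reverse=True)
    attacked = set(by_degree[:n_remove])               # intentional attack
    at_random = set(rng.sample(range(n), n_remove))    # random breakdown
    return (largest_cluster_fraction(n, edges, attacked),
            largest_cluster_fraction(n, edges, at_random))


if __name__ == "__main__":
    print(compare_attack_and_random())
```

With these illustrative parameters the largest cluster after attack should be a small fraction of the surviving sites, while random removal of the same number of sites leaves a large cluster intact.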

Although the number of nodes removed in intentional 
attack is different than in the random breakdown model, 
this affects the size of the spanning cluster (see below) 
but not the critical point. This is because the transition 
point is defined as the point where the spanning cluster 
becomes a finite fraction of the whole network. A finite 
fraction of the remaining nodes is also a finite fraction 
of the original network, so the difference has no effect on $p_c$.

We therefore use Eqs. (3) and (5), but with $\tilde{p} = (\tilde{K}/m)^{2-\alpha}$ and $\tilde{K}$ replacing $p_c$ and $K$. This yields the equation:

$$\left(\frac{\tilde{K}}{m}\right)^{2-\alpha} - 2 = \frac{2-\alpha}{3-\alpha}\, m \left[\left(\frac{\tilde{K}}{m}\right)^{3-\alpha} - 1\right], \tag{11}$$

which can be solved numerically to obtain $\tilde{K}(m, \alpha)$, and then $p_c(m, \alpha)$ can be retrieved from Eq. (8). In Fig. 1 we plot $p_c$, the critical fraction of sites that must be removed in the sabotage strategy to disrupt the network, computed in this fashion and compared to results from numerical simulations. A phase transition exists (at a finite $p_c$) for all $\alpha > 2$. The decline in $p_c$ for large $\alpha$ is explained by the fact that as $\alpha$ increases the spanning cluster becomes smaller in size, even before attack. (Furthermore, for $m < 2$ the original network is disconnected for some large enough $\alpha$.) The decline in $p_c$ as $\alpha \to 2$ results from the critically high connectivity of just a few sites: their removal disrupts the whole network. This was already argued in [?]. We note that for infinite systems $p_c \to 0$ as $\alpha \to 2$. The critical fraction $p_c$ is rather sensitive to the lower connectivity cutoff $m$. For larger $m$ (the case of $m = 1$ is shown in Fig. 1) the networks are more robust, though they still undergo a transition at a finite $p_c$.
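The numerical solution for the critical cutoff can be sketched with simple bisection. The code below assumes the continuum forms $(\tilde{K}/m)^{2-\alpha} - 2 = \frac{2-\alpha}{3-\alpha}\, m\,[(\tilde{K}/m)^{3-\alpha} - 1]$ for the critical condition and $p_c = (\tilde{K}/m)^{1-\alpha}$ for the removed fraction; the function name, the search bound `x_max`, and the choice of bisection are illustrative assumptions rather than the authors' procedure.

```python
def attack_threshold(alpha, m, x_max=1e12, tol=1e-12):
    """Solve the critical condition for x = K~/m by bisection.

    Returns (K_tilde, p_c), or None if no root is found below x_max
    (in this approximation, no spanning cluster even before attack).
    """
    if alpha <= 2.0 or alpha == 3.0:
        raise ValueError("this sketch handles alpha > 2 with alpha != 3 only")
    a = (2.0 - alpha) / (3.0 - alpha)

    def f(x):
        return x ** (2.0 - alpha) - 2.0 - a * m * (x ** (3.0 - alpha) - 1.0)

    # f(1) = -1 and f is increasing for x > 1 when m >= 1, so expand the
    # bracket until the sign changes, then bisect.
    lo, hi = 1.0, 2.0
    while f(hi) < 0.0:
        hi *= 2.0
        if hi > x_max:
            return None
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    x = 0.5 * (lo + hi)
    return m * x, x ** (1.0 - alpha)


if __name__ == "__main__":
    for alpha in (2.2, 2.5, 2.8, 3.3):
        print(alpha, attack_threshold(alpha, 1), attack_threshold(alpha, 2))
```

For $\alpha = 2.5$ and $m = 1$ the condition reduces to $\sqrt{x} + 1/\sqrt{x} = 3$, which gives a critical fraction of only a few percent; a larger $m$ raises it, in line with the greater robustness noted above.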

The size of the spanning cluster as a fraction of the number of undeleted sites, $P_\infty(p)$, can be calculated using the methods introduced in [?] and developed in [?]. Following closely the derivation in [?], a generating function is built for the connectivity distribution:

$$G_0(x) = \sum_{k=0}^{\infty} P(k) x^k. \tag{12}$$

The probability of reaching a site with connectivity $k$ by following a specific link is $k P(k) / \langle k \rangle$, and the corresponding generating function is

$$G_1(x) = \frac{\sum_k k P(k) x^{k-1}}{\sum_k k P(k)} = \frac{d}{dx} G_0(x) / \langle k \rangle. \tag{13}$$



Hence, the generating function for the probability of reaching a branch of a given size by following a link is

$$H_1(x) = x G_1(H_1(x)), \tag{14}$$

while the generating function for the size of a component is

$$H_0(x) = x G_0(H_1(x)). \tag{15}$$



Then, $P_\infty = 1 - H_0(1)$, since $H_0$ contains only the finite-size clusters. It follows that

$$P_\infty(p) = 1 - \sum_{k=0}^{\infty} P(k) u^k, \tag{16}$$

where $u \equiv H_1(1)$ is the smallest positive root (found numerically) of

$$\langle k \rangle u = \sum_{k=0}^{\infty} k P(k) u^{k-1}. \tag{17}$$
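The self-consistency condition for $u$ and the resulting spanning-cluster fraction can be evaluated numerically for any connectivity distribution. The following sketch assumes the forms $\langle k \rangle u = \sum_k k P(k) u^{k-1}$ and $P_\infty = 1 - \sum_k P(k) u^k$, and uses fixed-point iteration from $u = 0$, which increases monotonically to the smallest non-negative root. The iteration scheme and the function name are choices made here for illustration; the text only states that the root is found numerically.

```python
def spanning_cluster_fraction(P, tol=1e-14, max_iter=100000):
    """P maps connectivity k to probability P(k) (assumed normalized).

    Iterates u <- sum_k k P(k) u**(k-1) / <k> starting from u = 0,
    then returns 1 - sum_k P(k) u**k.
    """
    mean_k = sum(k * pk for k, pk in P.items())
    u = 0.0
    for _ in range(max_iter):
        u_next = sum(k * pk * u ** (k - 1)
                     for k, pk in P.items() if k > 0) / mean_k
        converged = abs(u_next - u) < tol
        u = u_next
        if converged:
            break
    return 1.0 - sum(pk * u ** k for k, pk in P.items())


if __name__ == "__main__":
    # Half the sites have one link and half have three: u = 1/3 exactly.
    print(spanning_cluster_fraction({1: 0.5, 3: 0.5}))
    # Half have one link and half have two: <k^2>/<k> < 2, no spanning cluster.
    print(spanning_cluster_fraction({1: 0.5, 2: 0.5}))
```

In the first example the condition reads $u = 1/4 + 3u^2/4$, whose smallest root is $u = 1/3$, so the spanning cluster holds $22/27$ of the sites. To treat a network after attack, the same function would be called with the post-attack connectivity distribution.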



We argue that the same holds true after attack, but the sums in Eqs. (16) and (17) should run only up to $k = \tilde{K}$, and the original distribution $P(k)$ should be replaced by the new connectivity distribution $\tilde{P}(k)$:

$$\tilde{P}(k) = \sum_{k_0=k}^{\tilde{K}} P(k_0) \binom{k_0}{k} (1-\tilde{p})^k \tilde{p}^{k_0-k}. \tag{18}$$

The actual fraction of removed sites, $p$, is inconsequential since we calculate the size of the infinite cluster relative to the number of undeleted sites. $P_\infty$ evaluated in this fashion, and compared to numerical simulations, is shown in Fig. 2.

Since intentional attack always leads to a finite cutoff which does not scale with the system size, all the moments are finite and Eqs. (16) and (17) are well behaved. Therefore, there always exists a linear term in the series expansion of $P_\infty(p)$. Hence, near the critical point $P_\infty \sim |p_c - p|^\beta$, where $\beta = 1$ [?].

Finally, we consider the distance between sites in the spanning cluster. The behavior of this quantity in diluted networks is different from that in the highly connected regime. The average distance between two sites in the spanning cluster of a highly connected network is proportional to $\log_k N$, where $k$ is the average connectivity and $N$ is the number of sites [9]. This has also been shown to hold for scale-free and general networks [?]. However, the diluted case is essentially the same as infinite-dimensional percolation. In this case, there is no notion of geometrical distance (since the graph is not embedded in a Euclidean space), but only of a distance along the graph (which is the shortest distance along bonds). It is known from infinite-dimensional percolation theory that the fractal dimension at criticality is $d_f = 2$ [11]. Therefore the average (chemical) distance $d$ between pairs of sites on the spanning cluster at criticality behaves as

$$d \sim \sqrt{M}, \tag{19}$$



where $M$ is the number of sites in the spanning cluster. This is analogous to percolation in finite dimensions, where on length scales smaller than the correlation length the cluster is a fractal with dimension $d_f$, and above the correlation length the cluster is homogeneous and has the dimension of the embedding space. In our infinite-dimensional case, the crossover between these two behaviors occurs around the correlation length $\xi \sim |p_c - p|^{-1}$, as can be seen in Fig. 3.
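The mass within a given chemical distance can be measured on any graph by breadth-first search. The sketch below is a minimal illustration under assumed conventions: the graph is given as a dictionary of adjacency lists, and the function name is hypothetical. It returns $M(d)$, the number of sites within distance $d$ of a chosen source, for $d = 0, 1, 2, \ldots$

```python
from collections import deque


def mass_vs_distance(adjacency, source):
    """Cumulative number of sites within chemical distance d of `source`,
    returned as a list indexed by d (breadth-first search)."""
    dist = {source: 0}
    queue = deque([source])
    shell_counts = [1]                  # sites at exactly distance d
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                if dist[neighbour] == len(shell_counts):
                    shell_counts.append(0)
                shell_counts[dist[neighbour]] += 1
                queue.append(neighbour)
    mass, total = [], 0
    for count in shell_counts:
        total += count
        mass.append(total)
    return mass


if __name__ == "__main__":
    path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
    print(mass_vs_distance(path, 0))    # linear growth on a chain
```

Averaging such curves over many sources on a spanning cluster, and plotting them on logarithmic axes, is one way to look for the crossover from power-law growth of $M$ with $d$ near criticality to exponential growth in the well-connected regime.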

In summary, we have shown that scale-free networks are highly sensitive to intentional attack. This is true even for networks with $\alpha < 3$, which are known to be resilient to random removal of their sites. The high sensitivity near $\alpha = 2$ results from the presence of just a few sites with connectivity comparable to the size of the system: their removal disrupts the whole network. We note that while the cutoff $\tilde{K}$ must reach a typically small number before the network is disrupted, this is achieved with a modest removal fraction $p$ (Eq. (8)). The effect of sabotage on the connectivity distribution of the remaining sites after the attack, and thus the relation between $p$, $\tilde{K}$, and $\tilde{p}$ (Eqs. (8) and (10)), is found explicitly in our approach. This effect is particularly important near the borderline case of $\alpha = 2$. We have also shown that the average distance between pairs of sites in the spanning cluster grows dramatically near criticality. This makes communication very inefficient, even before the spanning cluster is completely disrupted.

FIG. 1. Critical probability, $p_c$, as a function of $\alpha$, for networks of size N = 500,000 (circles) and N = 64,000 (squares). Lines represent the analytical solution, obtained from Eqs. (8) and (11).

FIG. 2. Fraction of sites belonging to the spanning cluster, $P_\infty$, as a function of the fraction of removed sites, $p$, for networks with $\alpha = 2.5$ (circles), $\alpha = 2.8$ (squares), and $\alpha = 3.3$ (diamonds). Lines represent the analytical result from Eqs. (16) and (17). Both the simulation and analysis are for system size N = 500,000.

FIG. 3. Mass (number of sites), $M$, as a function of distance, $d$, on the spanning cluster. The correlation length is $\xi = |p - p_c|^{-1}$. Note that for $d/\xi < 1$ the slope is 2, corresponding to the behavior in the critical regime, while for $d/\xi > 1$, $M$ grows exponentially with $d$, corresponding to the well-connected regime.